Article 7415

Title of the article

AN ALGORITHM FOR CONSTRUCTING A STATISTICAL DISCRETE-CONTINUUM DESCRIPTION OF SOUND DURATION OF ANNOUNCER’S MEANINGFUL SPEECH FLOW 

Authors

Boykov Il'ya Vladimirovich, Doctor of physical and mathematical sciences, professor, head of sub-department of higher and applied mathematics, Penza State University (40 Krasnaya street, Penza, Russia), boikov@pnzgu.ru
Ivanov Aleksandr Ivanovich, Doctor of engineering sciences, head of laboratory of biometric and neural network technologies, Penza Research Institute of Electrical Engineering (9 Sovetskaya street, Penza, Russia), ivan@pniei.ru
Kalashnikov Dmitriy Mikhaylovich, Postgraduate student, Penza State University (40 Krasnaya street, Penza, Russia), kalashnikovdm.penza@gmail.com

UDC index

004; 519.7; 519.6; 519.66; 612.087.1

Abstract

Background. The main problems in developing algorithms and software for voice authentication are variation in the user's voice (which can change with health, age, mood, etc.) and the presence of a noise component. Solving these problems will make it possible to use voice authentication technology to provide strong protection of personal data together with ease of use. In addition, it is the cheapest of the existing identification technologies.
Materials and methods. The authors used numerical methods, digital signal processing, spectral methods, methods of mathematical statistics and time-series analysis, as well as artificial intelligence and pattern recognition. The speech fragmenter is based on a continuous-discrete model of speech processing which, in combination with a narrow-band filter, makes it possible to determine the average duration of a sound.
Results. It is shown that a qualitative tone/noise speech classifier should output "0" and "1" states whose durations are described by a continuum-discrete distribution of the lengths of the intervals between sections of tonal sounds, with the values distributed according to normal laws. The discrete part of the distribution arises from the discrete nature of the flow of tonal and noise sounds in speech, as well as their combinations (pairs, triples, quadruples, etc.). The continuous (continuum) part of the distribution of sound lengths is caused by the instability of speech as the pace of pronunciation changes. The article describes a method for calculating the average length of one sound of meaningful speech. The study has made it possible to build a machine that determines the average length of a sound in different parts of an audio signal.
Conclusions. The article proposes a numerical algorithm for identifying an individual speaker's speech that makes it possible to synchronize speech segments. Using the developed algorithm made it possible to specify the parameter values that characterize the statistical description of the durations of the intervals between speech sounds and of the noise between tonal sounds. The study has made it possible to build a machine that determines the average length of a sound in different parts of an audio signal. The results form a basis for building neural network authentication technologies.
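One possible way to write down the continuum-discrete description is as a mixture of normal densities. This is only an illustrative reading of the abstract, not the authors' formula: the symbols \tau (interval duration), T (average duration of one sound), k (number of sounds in a combination), w_k (share of combinations of k sounds), and \sigma_k (spread caused by changes in pronunciation pace) are all assumptions, as is placing the component means at integer multiples kT.

```latex
p(\tau) = \sum_{k=1}^{K} w_k \,
          \frac{1}{\sigma_k \sqrt{2\pi}}
          \exp\!\left( -\frac{(\tau - kT)^2}{2\sigma_k^2} \right),
\qquad \sum_{k=1}^{K} w_k = 1, \quad w_k \ge 0 .
```

In this reading the index k carries the discrete part (single sounds, pairs, triples, and so on), while the normal spread around each kT carries the continuous part.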
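The idea of estimating an average sound length from the "0"/"1" output of a tone/noise classifier can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes that interval durations cluster near integer multiples of one unit duration and that the shortest observed interval is a reasonable first guess for that unit. The function names `run_lengths` and `estimate_unit_duration` and the parameter `frame_ms` are hypothetical.

```python
from itertools import groupby


def run_lengths(bits, value, frame_ms=1.0):
    """Durations (in ms) of consecutive runs of `value` in a 0/1 sequence."""
    return [sum(1 for _ in group) * frame_ms
            for key, group in groupby(bits) if key == value]


def estimate_unit_duration(durations, initial=None, iterations=20):
    """Estimate a unit duration T such that each duration is close to k*T.

    Assumption (not from the article): alternate between assigning each
    duration an integer count k >= 1 and refitting T by least squares.
    """
    if not durations:
        raise ValueError("no durations given")
    # Assumed starting point: the shortest interval is a single sound.
    unit = initial if initial is not None else min(durations)
    for _ in range(iterations):
        counts = [max(1, round(d / unit)) for d in durations]
        # Least-squares fit of T for fixed counts: minimize sum (d - k*T)^2.
        new_unit = (sum(k * d for k, d in zip(counts, durations))
                    / sum(k * k for k in counts))
        converged = abs(new_unit - unit) < 1e-9
        unit = new_unit
        if converged:
            break
    # Recompute counts so they match the final unit estimate.
    counts = [max(1, round(d / unit)) for d in durations]
    return unit, counts


# Example: 10 ms frames, tonal runs of 3, 6 and 3 frames.
bits = [1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1]
tone_durations = run_lengths(bits, 1, frame_ms=10.0)
unit, counts = estimate_unit_duration(tone_durations)
```

The alternating fit is sensitive to the starting guess: a start near half the true unit would converge to that shorter value, so in practice the initial value would need to come from prior knowledge of typical sound lengths.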

Key words

numerical methods, digital signal processing, biometric systems, voice authentication.


 

Date created: 12.05.2016 11:33
Date updated: 12.05.2016 13:16